
    Using Growing Self-Organising Maps to Improve the Binning Process in Environmental Whole-Genome Shotgun Sequencing

    Metagenomic projects using whole-genome shotgun (WGS) sequencing produce many unassembled DNA sequences and small contigs. The step of clustering these sequences, based on biological and molecular features, is called binning. A reported strategy for binning that combines oligonucleotide frequency and self-organising maps (SOM) shows high potential. We improve this strategy by identifying suitable training features, implementing a better clustering algorithm, and defining quantitative measures for assessing results. We investigated the suitability of each of di-, tri-, tetra-, and pentanucleotide frequencies. The results show that dinucleotide frequency is not a sufficiently strong signature for binning 10 kb long DNA sequences, compared to the other three. Furthermore, we observed that increasing the order of the oligonucleotide frequency may worsen the assignment result in some cases, which indicates that an optimal, species-specific oligonucleotide frequency may exist. We replaced SOM with a growing self-organising map (GSOM), which gives comparable results while gaining a 7%–15% speed improvement.
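    The oligonucleotide-frequency features described in this abstract can be illustrated with a minimal sketch. It assumes a plain count of overlapping k-mers on one strand, normalised to sum to 1; the abstract does not state the paper's normalisation or strand handling, and the function name `oligo_frequencies` is hypothetical.

```python
from collections import Counter
from itertools import product


def oligo_frequencies(seq, k=4):
    """Return normalised k-mer frequencies over ACGT, in lexicographic k-mer order.

    Assumption: overlapping windows on a single strand; windows containing
    any character other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    # All 4**k possible k-mers, e.g. 256 tetranucleotides for k=4.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made purely of A/C/G/T contribute to the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    # Toy fragment, for illustration only.
    vec = oligo_frequencies("ACGTACGTTTGACCA", k=2)
    print(len(vec), round(sum(vec), 6))
```

    A vector of this kind (16, 64, 256, or 1024 entries for k = 2 to 5) is the sort of per-fragment input a SOM or GSOM could be trained on.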

    Binning sequences using very sparse labels within a metagenome

    Background: In metagenomic studies, a process called binning is necessary to assign contigs that belong to multiple species to their respective phylogenetic groups. Most of the current methods of binning, such as BLAST, k-mer and PhyloPythia, involve assigning sequence fragments by comparing sequence similarity or sequence composition with already-sequenced genomes that are still far from comprehensive. We propose a semi-supervised seeding method for binning that does not depend on knowledge of completed genomes. Instead, it extracts the flanking sequences of highly conserved 16S rRNA from the metagenome and uses them as seeds (labels) to assign other reads based on their compositional similarity. Results: The proposed seeding method is implemented on an unsupervised Growing Self-Organising Map (GSOM), and called Seeded GSOM (S-GSOM). We compared it with four well-known semi-supervised learning methods in a preliminary test, separating random-length prokaryotic sequence fragments sampled from the NCBI genome database. We identified the flanking sequences of the highly conserved 16S rRNA as suitable seeds that could be used to group the sequence fragments according to their species. S-GSOM showed superior performance compared to the semi-supervised methods tested. Additionally, S-GSOM may also be used to visually identify some species that do not have seeds. The proposed method was then applied to simulated metagenomic datasets using two different confidence threshold settings and compared with PhyloPythia, k-mer and BLAST. At the reference taxonomic level Order, S-GSOM outperformed all k-mer and BLAST results and showed comparable results with PhyloPythia for each of the corresponding confidence settings, where S-GSOM performed better than PhyloPythia in the ≥ 10 reads datasets and comparable in the ≥ 8 kb benchmark tests. Conclusion: In the task of binning using semi-supervised learning methods, results indicate S-GSOM to be the best of the methods tested. Most importantly, the proposed method does not require knowledge from known genomes and uses only very few labels (one per species is sufficient in most cases), which are extracted from the metagenome itself. These advantages make it a very attractive binning method. S-GSOM outperformed the binning methods that depend on already-sequenced genomes, and compares well to the current most advanced binning method, PhyloPythia.

    Crystallization and X-ray Structure Determination of Cytochrome c_2 from Rhodobacter sphaeroides in Three Crystal Forms

    Cytochrome c_2 serves as the secondary electron donor that reduces the photo-oxidized bacteriochlorophyll dimer in photosynthetic bacteria. Cytochrome c_2 from Rhodobacter sphaeroides has been crystallized in three different forms. At high ionic strength, crystals of a hexagonal space group (P6_122) were obtained, while at low ionic strength, triclinic (P1) and tetragonal (P4_12_12) crystals were formed. The three-dimensional structures of the cytochrome in all three crystal forms have been determined by X-ray diffraction at resolutions of 2.20 Å (hexagonal), 1.95 Å (triclinic) and 1.53 Å (tetragonal). The most significant difference observed was the binding of an imidazole molecule to the iron atom of the heme group in the hexagonal structure. This binding displaces the sulfur atom of Met 100, which forms the axial ligand in the triclinic and tetragonal structures.

    Metagenomic next-generation sequencing of samples from pediatric febrile illness in Tororo, Uganda.

    Febrile illness is a major burden in African children, and non-malarial causes of fever are uncertain. In this retrospective exploratory study, we used metagenomic next-generation sequencing (mNGS) to evaluate serum, nasopharyngeal, and stool specimens from 94 children (aged 2-54 months) with febrile illness admitted to Tororo District Hospital, Uganda. The most common microbes identified were Plasmodium falciparum (51.1% of samples) and parvovirus B19 (4.4%) from serum; human rhinoviruses A and C (40%), respiratory syncytial virus (10%), and human herpesvirus 5 (10%) from nasopharyngeal swabs; and rotavirus A (50% of those with diarrhea) from stool. We also report the near complete genome of a highly divergent orthobunyavirus, tentatively named Nyangole virus, identified from the serum of a child diagnosed with malaria and pneumonia, a Bwamba orthobunyavirus in the nasopharynx of a child with rash and sepsis, and the genomes of two novel human rhinovirus C species. In this retrospective exploratory study, mNGS identified multiple potential pathogens, including 3 new viral species, associated with fever in Ugandan children

    Three-dimensional microCT imaging of mouse development from early post-implantation to early postnatal stages

    In this work, we report the use of iodine-contrast microCT to perform high-throughput 3D morphological analysis of mouse embryos and neonates between embryonic day 8.5 and postnatal day 3, with high spatial resolution up to 3 µm/voxel. We show that mouse embryos at early stages can be imaged within extraembryonic tissues such as the yolk sac or the decidua without physically disturbing the embryos. This method enables a full, undisturbed analysis of embryo turning, allantois development, vitelline vessel remodeling, and yolk sac and early placenta development, which provides increased insight into early embryonic lethality in mutant lines. Moreover, these methods are inexpensive, simple to learn and do not require substantial processing time, making them ideal for high-throughput analysis of mouse mutants with embryonic and early postnatal lethality.

    Magnetotransport properties of a polarization-doped three-dimensional electron slab

    We present evidence of strong Shubnikov-de-Haas magnetoresistance oscillations in a polarization-doped degenerate three-dimensional electron slab in an Al$_{x}$Ga$_{1-x}$N semiconductor system. The degenerate free carriers are generated by a novel technique by grading a polar alloy semiconductor with spatially changing polarization. Analysis of the magnetotransport data enables us to extract an effective mass of $m^{\star} = 0.19\,m_{0}$ and a quantum scattering time of $\tau_{q} = 0.3$ ps. Analysis of scattering processes helps us extract an alloy scattering parameter for the Al$_{x}$Ga$_{1-x}$N material system to be $V_{0} = 1.8$ eV.

    Cenozoic paleoceanography 1986: An introduction

    New developments in Cenozoic paleoceanography include the application of climate models and atmospheric general circulation models to questions of climate reconstruction, the refinement of conceptual models for interpretation of the carbon isotope record in terms of carbon mass balance, paleocirculation, paleoproductivity, and the regional mapping of paleoceanographic events by acoustic stratigraphy. Sea level change emerges as a master variable to which changes in the ocean environment must be traced in many cases, and tests of the onlap-offlap paradigm therefore are of crucial importance

    Novel venom gene discovery in the platypus

    Background: To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. Results: We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation. Conclusions: This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom.